Speech emotion recognition using deep neural network and extreme learning machine

Authors

  • Kun Han
  • Dong Yu
  • Ivan Tashev
Abstract

Speech emotion recognition is a challenging problem, partly because it is unclear which features are effective for the task. In this paper we propose to use deep neural networks (DNNs) to extract high-level features from raw data and show that they are effective for speech emotion recognition. We first produce an emotion state probability distribution for each speech segment using DNNs. We then construct utterance-level features from the segment-level probability distributions. These utterance-level features are then fed into an extreme learning machine (ELM), a simple and efficient single-hidden-layer neural network, to identify utterance-level emotions. The experimental results demonstrate that the proposed approach effectively learns emotional information from low-level features and leads to a 20% relative accuracy improvement compared to state-of-the-art approaches.
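
The second and third stages described in the abstract (pooling segment-level probabilities into an utterance-level vector, then classifying with an ELM) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the pooling statistics (per-emotion max, min, mean), the sigmoid activation, the hidden-layer size, and the helper `fake_utterance`, which draws synthetic segment probabilities in place of real DNN output, are all hypothetical choices. The only ELM property relied on is the standard one: hidden weights are random and fixed, and output weights are solved in closed form by least squares.

```python
import numpy as np

def utterance_features(segment_probs):
    """Pool segment-level emotion probabilities, shape (n_segments, n_emotions),
    into one fixed-length vector. Using max/min/mean is an assumption."""
    p = np.asarray(segment_probs, dtype=float)
    return np.concatenate([p.max(axis=0), p.min(axis=0), p.mean(axis=0)])

class ELM:
    """Single-hidden-layer network with random, untrained hidden weights."""
    def __init__(self, n_hidden=30, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid hidden activations (activation choice is an assumption).
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y, n_classes):
        X = np.asarray(X, dtype=float)
        self.W = self.rng.standard_normal((X.shape[1], self.n_hidden))
        self.b = self.rng.standard_normal(self.n_hidden)
        T = np.eye(n_classes)[np.asarray(y)]             # one-hot targets
        self.beta = np.linalg.pinv(self._hidden(X)) @ T  # least-squares output weights
        return self

    def predict(self, X):
        H = self._hidden(np.asarray(X, dtype=float))
        return np.argmax(H @ self.beta, axis=1)

def fake_utterance(rng, emotion, n_emotions=3, n_segments=20):
    """Hypothetical stand-in for DNN output: probabilities biased toward one emotion."""
    alpha = np.ones(n_emotions)
    alpha[emotion] = 8.0
    return rng.dirichlet(alpha, size=n_segments)

# Synthetic demo: 60 utterances, 3 emotion classes, 45 for training.
rng = np.random.default_rng(0)
labels = np.arange(60) % 3
X = np.stack([utterance_features(fake_utterance(rng, k)) for k in labels])
elm = ELM(n_hidden=30).fit(X[:45], labels[:45], n_classes=3)
print("held-out accuracy:", np.mean(elm.predict(X[45:]) == labels[45:]))
```

Because the output weights come from a single pseudo-inverse, training involves no iterative optimization, which is the sense in which the abstract calls the ELM simple and efficient.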

Similar resources

Speech Emotion Recognition Using Scalogram Based Deep Structure

Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...

A hybrid EEG-based emotion recognition approach using Wavelet Convolutional Neural Networks (WCNN) and support vector machine

Nowadays, deep learning and convolutional neural networks (CNNs) have become widespread tools in many biomedical engineering studies. A CNN is an end-to-end tool that integrates the processing procedure, but in some situations it needs to be fused with other machine learning methods to be more accurate. In this paper, a hybrid approach based on deep features extracted from Wave...

Speech Emotion Recognition Based on Deep Belief Networks and Wavelet Packet Cepstral Coefficients

A wavelet-packet-based adaptive filter-bank construction combined with a Deep Belief Network (DBN) feature learning method is proposed for speech signal processing in this paper. On this basis, a set of acoustic features is extracted for speech emotion recognition, namely Coiflet Wavelet Packet Cepstral Coefficients (CWPCC). CWPCC extends the conventional Mel-Frequency Cepstral Coefficients (MFCC)...

A New Method for Detecting Ships in Low Size and Low Contrast Marine Images: Using Deep Stacked Extreme Learning Machines

Detecting ships in marine images is an essential problem in maritime surveillance systems. Although several types of deep neural networks have been used almost ubiquitously for this purpose, the performance of such networks drops sharply when they are exposed to small, low-contrast images captured by passive monitoring systems. On the other hand, factors such as sea waves, c...

Speech Recognition Using Deep Learning Algorithms

Automatic speech recognition, the translation of spoken words into text, is still a challenging task due to the high variability in speech signals. Deep learning, sometimes referred to as representation learning or unsupervised feature learning, is a new area of machine learning. Deep learning is becoming a mainstream technology for speech recognition and has successfully replaced Gaussian mixtures for ...

Publication date: 2014